k-mer Profiling for Bacterial Identification
Similar resources
Bacterial population assay via k-mer analysis
Identifying and assaying the relative abundance of members of complex microbial communities is an important problem in ecology. Sandberg et al. investigated the usage of genomic signatures to provide high identification percentages from short sequence samples. In this paper we present an improved naive Bayesian classification method using conditional probabilities, which can be used to classify...
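The general idea of classifying a short read by its k-mer content with a naive Bayes model can be sketched as follows. This is only an illustrative sketch, not the method of the paper above or of Sandberg et al.: it assumes a multinomial model over overlapping k-mers, add-one (Laplace) smoothing, and a uniform prior over reference genomes. The function names (`kmer_counts`, `train`, `classify`) and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq, skipping any k-mer with a non-ACGT symbol."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def train(genomes, k):
    """Build one k-mer count profile per reference genome.

    genomes: dict mapping a genome name to its sequence (assumed input shape).
    Returns a dict mapping name -> (k-mer counts, total k-mers, vocabulary size).
    """
    vocab = 4 ** k  # number of possible DNA k-mers
    model = {}
    for name, seq in genomes.items():
        counts = kmer_counts(seq, k)
        model[name] = (counts, sum(counts.values()), vocab)
    return model


def log_likelihood(read, entry, k):
    """Log P(read | genome) under the naive independence assumption between k-mers."""
    counts, total, vocab = entry
    score = 0.0
    for kmer, n in kmer_counts(read, k).items():
        # Add-one smoothing keeps unseen k-mers from giving log(0).
        score += n * math.log((counts[kmer] + 1) / (total + vocab))
    return score


def classify(read, model, k):
    """Return the genome name with the highest log-likelihood (uniform prior assumed)."""
    return max(model, key=lambda name: log_likelihood(read, model[name], k))


# Toy usage with made-up sequences.
genomes = {
    "at_rich": "ATATATATATATATATTTAAATATAT",
    "gc_rich": "GCGCGCGGCCGCGCGCCGGCGCGC",
}
model = train(genomes, k=3)
label = classify("ATATTA", model, k=3)
```

With real data, a non-uniform prior would simply add a log-prior term per genome to the score before taking the maximum.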
MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation
Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (k = 30, 50) to fit a k-mer "palette" of a given sample to the k-mer palette of reference organisms. By mod...
Statistics for K-mer Based Splicing Analysis
It is well acknowledged that the alternative splicing module plays a crucial role in identifying the variations of RNA transcriptomes. In high-throughput short-read RNA, splicing analysis is a challenging task due to the uncertainty and time complexity of read alignments onto the genome and transcriptome. In this paper, we introduce a k-mer based statistical method for splicing event analysis. The k-me...
SEK: sparsity exploiting k-mer-based estimation of bacterial community composition
Motivation: Estimation of bacterial community composition from a high-throughput sequenced sample is an important task in metagenomics applications. As the sample sequence data typically harbors reads of variable lengths and different levels of biological and technical noise, accurate statistical analysis of such data is challenging. Currently popular estimation methods are typically time-consum...
Compact Universal k-mer Hitting Sets
We address the problem of finding a minimum-size set of k-mers that hits L-long sequences. The problem arises in the design of compact hash functions and other data structures for efficient handling of large sequencing datasets. We prove that the problem of hitting a given set of L-long sequences is NP-hard and give a heuristic solution that finds a compact universal k-mer set that hits any set...
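To make the notion of "hitting" concrete: a k-mer set hits a sequence if the sequence contains at least one k-mer from the set, and it is universal for length L if it hits every possible L-long sequence. The sketch below checks this by brute force. It is not the heuristic from the paper above; the names `hits` and `is_universal` are hypothetical, and the exhaustive enumeration is only feasible for very small L and alphabets.

```python
from itertools import product


def hits(seq, kmer_set, k):
    """Return True if seq contains at least one k-mer from kmer_set."""
    return any(seq[i:i + k] in kmer_set for i in range(len(seq) - k + 1))


def is_universal(kmer_set, k, L, alphabet="ACGT"):
    """Brute-force check that kmer_set hits every L-long sequence over alphabet.

    Enumerates len(alphabet) ** L sequences, so this is only a toy check.
    """
    return all(
        hits("".join(symbols), kmer_set, k)
        for symbols in product(alphabet, repeat=L)
    )


# Toy usage on a two-letter alphabet with k = 2 and L = 3.
small_ok = is_universal({"AA", "CC", "AC"}, k=2, L=3, alphabet="AC")
small_bad = is_universal({"AA", "CC"}, k=2, L=3, alphabet="AC")  # "ACA" and "CAC" are not hit
```

In the toy case, the only 2-mer missing from the first set is "CA", and no 3-long sequence can consist solely of "CA" windows, so every sequence is hit.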
Journal
Journal title: HELIX
Year: 2018
ISSN: 2277-3495,2319-5592
DOI: 10.29042/2018-4007-4009